Multiple Character Regular Expressions

Single character regular expressions can be combined into more complex regular
expressions that match more than one character at a time. You can also apply
modifiers to single character regular expressions to make them match more than
one instance of the character or characters specified by the single character
regular expression.

Use the following rules to construct regular expressions from single character
regular expressions:

A single character regular expression followed by an asterisk (*) is a regular
expression that matches zero or more consecutive occurrences of the single
character regular expression, always as many as possible. For example, f*
matches zero or more consecutive f characters.
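
The asterisk rule can be sketched in Python, whose `re` module uses the same `*` notation. This is only an illustration under that assumption; the helper name `leading_f_run` is hypothetical and not part of the original text.

```python
import re

def leading_f_run(text):
    """Return whatever f* matches at the start of text (possibly nothing)."""
    # f* always succeeds: it matches zero or more "f" characters,
    # taking as many as possible.
    return re.match(r"f*", text).group(0)

print(leading_f_run("fffoo"))      # fff
print(repr(leading_f_run("bar")))  # ''
```

The second call shows the "zero occurrences" case: the expression still matches, but the matched string is empty.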

A single character regular expression followed by \{m\}, \{m,\}, or \{m,n\} is
a regular expression that matches a specified number of occurrences of the
single character regular expression. The values of m and n must be nonnegative
integers less than 255. \{m\} matches exactly m occurrences of the single
character regular expression; \{m,\} matches at least m occurrences; \{m,n\}
matches any number of occurrences between m and n, inclusive. Whenever a choice
exists, the regular expression matches as many occurrences as possible. For
example, the regular expression [0-9]\{1,4\} matches any sequence of 1 to 4
digits.
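
A rough Python sketch of the interval rule follows. Note the assumption: Python's `re` module writes the interval without backslashes, so `\{1,4\}` becomes `{1,4}`. The names `DIGITS_1_TO_4` and `first_digit_run` are hypothetical.

```python
import re

# Assumption: Python spells the interval \{1,4\} as {1,4}.
DIGITS_1_TO_4 = re.compile(r"[0-9]{1,4}")

def first_digit_run(text):
    """Return the first run of 1 to 4 digits found in text, or None."""
    m = DIGITS_1_TO_4.search(text)
    return m.group(0) if m else None

print(first_digit_run("id 123456"))  # 1234
print(first_digit_run("x7y"))        # 7
print(first_digit_run("none"))       # None
```

In the first call six digits are available, but the match stops at four, the upper bound, which is the "as many occurrences as possible" behavior within the allowed range.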

A concatenation of regular expressions is a regular expression that matches the
concatenation of the strings matched by each component of the regular
expression. For example, the regular expression [0-9][a-z] matches any string
of two characters that starts with a digit and ends with a lowercase letter.
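
Concatenation can be illustrated the same way. The sketch below assumes Python's `re` module and uses `re.fullmatch` so that the whole string must be exactly the two characters described; the function name `is_digit_then_lower` is hypothetical.

```python
import re

def is_digit_then_lower(text):
    """True if text is exactly one digit followed by one lowercase letter."""
    # [0-9] and [a-z] are concatenated: each component must match in turn.
    return re.fullmatch(r"[0-9][a-z]", text) is not None

print(is_digit_then_lower("7k"))  # True
print(is_digit_then_lower("k7"))  # False
print(is_digit_then_lower("7K"))  # False
```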

A regular expression enclosed between the character sequences \( and \) is a
sub-expression that matches whatever the original regular expression matches.
The sub-expression is placed in an internal, numbered register for later use.
The registers are numbered according to the order of the \( and \) pairs within
the whole regular expression. The first pair corresponds to register 1. For
example, if a string matches the regular expression \([0-9]*\)\([a-z]*\),
register 1 will contain whatever sequence of characters matched the regular
expression [0-9]*, and register 2 will contain whatever sequence of characters
matched the regular expression [a-z]*. There may be up to 9 sub-expressions in
a regular expression.
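
The register behavior can be approximated in Python, with one notational assumption: Python's `re` module writes sub-expressions as `( ... )` rather than `\( ... \)`, and exposes each register through `group(n)`. The names `PATTERN` and `registers` are hypothetical.

```python
import re

# Assumption: Python spells \([0-9]*\)\([a-z]*\) as ([0-9]*)([a-z]*).
PATTERN = re.compile(r"([0-9]*)([a-z]*)")

def registers(text):
    """Return the contents of register 1 and register 2 for text."""
    # Both sub-expressions use *, so the match always succeeds,
    # although either register may be empty.
    m = PATTERN.match(text)
    return m.group(1), m.group(2)

print(registers("123abc"))  # ('123', 'abc')
print(registers("abc"))     # ('', 'abc')
```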

The expression \n, where n is a digit from 1 through 9, matches the same string
of characters stored in the internal register number n; that is, the same
string of characters that the sub-expression corresponding to n originally
matched. Regular expressions of the form \n are meaningless unless there is a
sub-expression corresponding to n. For example, the regular expression
\([0-9]\)\1\1 matches any string that consists of the same digit three times,
such as 777.
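
A small Python sketch of the back-reference example is given here. It assumes Python's `re` module, where `\1` has the same meaning but the sub-expression is written `( ... )` instead of `\( ... \)`; the function name `is_triple_digit` is hypothetical.

```python
import re

def is_triple_digit(text):
    """True if text is the same digit repeated exactly three times."""
    # ([0-9]) stores one digit in register 1; each \1 must then
    # match that same digit again.
    return re.fullmatch(r"([0-9])\1\1", text) is not None

print(is_triple_digit("777"))  # True
print(is_triple_digit("778"))  # False
```

The string 778 fails because \1 repeats the text that was matched, not the pattern [0-9].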

A regular expression preceded by a caret (^) must match at the beginning of the
target string. A regular expression followed by a dollar sign ($) must match at
the end of the target string. Both may be used in the same regular expression.
For example, the regular expression ^[0-9]*$ only matches strings that consist
entirely of digits.
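
The anchored example can be tried in Python as a rough sketch, assuming the `re` module's `^` and `$` behave as described here. The function name `all_digits` is hypothetical.

```python
import re

def all_digits(text):
    """True if ^[0-9]*$ matches text, i.e. text consists only of digits."""
    # ^ pins the match to the start of the string and $ to the end,
    # so nothing other than digits may appear in between.
    return re.search(r"^[0-9]*$", text) is not None

print(all_digits("2024"))  # True
print(all_digits("20a4"))  # False
print(all_digits(""))      # True
```

Because * allows zero occurrences, the empty string also matches. One Python-specific caveat: `$` in the `re` module also matches just before a single trailing newline, so a string such as "123\n" is accepted as well.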